Block schema field drop if it is reference by an active partition or sort field#2410
Conversation
d4928bd to
81732d4
Compare
Anton-Tarazi
left a comment
There was a problem hiding this comment.
Nice work, left a minor comment :)
81732d4 to
7e47009
Compare
|
@Anton-Tarazi thanks for the review (this PR went off my radar) I have rebased and applied recommended changes |
geruh
left a comment
There was a problem hiding this comment.
Added some suggestions to get this moving for the 0.11 release. These fix the two bugs inverted allow_missing_fields logic, and string interpolation, also, we simplify the code by using existing schema._lazy_id_to_parent
7e47009 to
0a7a39e
Compare
|
Thanks for the review @geruh applied the comments |
geruh
left a comment
There was a problem hiding this comment.
Thanks @gabeiglio this LGTM!!
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM! thanks for the pr
couple of nits but not a blocker
| path = "/".join([field_str + "=" + value_str for field_str, value_str in zip(field_strs, value_strs, strict=True)]) | ||
| return path | ||
|
|
||
| def check_compatible(self, schema: Schema, allow_missing_fields: bool = False) -> None: |
There was a problem hiding this comment.
nit: doesnt look like allow_missing_fields is used anywhere, should we still keep it?
There was a problem hiding this comment.
check_compatible in Java its only used with allowMissingField=true (ctx) when reading metadata tables. Since here we only use this check (for now) at write path only, I think we could remove this field.
5d62d54 to
2a65eb7
Compare
2a65eb7 to
e12f66e
Compare
kevinjqliu
left a comment
There was a problem hiding this comment.
thanks for the fixes! i think we're almost there
| while parent_id: | ||
| parent_type = schema.find_type(parent_id) | ||
| if not parent_type.is_struct: | ||
| raise ValidationError(f"Invalid partition field parent: {parent_type}") |
There was a problem hiding this comment.
maybe a bit more info here too
|
Thanks for the PR @gabeiglio and thanks for the review @jayceslesar @Anton-Tarazi @geruh |
Closes #2166
Rationale for this change
We should block when an user wants to drop a column if that column is being referenced by either a active partition spec or sort order field.
Are these changes tested?
Yes, I added unit tests for every incompatible schema change in partitions and sort orders. Also added two new integration tests in
test_catalogto test for this scenarioAre there any user-facing changes?
No